208 research outputs found

    Post-transcriptional knowledge in pathway analysis increases the accuracy of phenotypes classification

    Motivation: Prediction of phenotypes from high-dimensional data is a crucial task in precision biology and medicine. Many technologies employ genomic biomarkers to characterize phenotypes. However, such elements are not sufficient to explain the underlying biology. To address this, pathway analysis techniques have been proposed. Nevertheless, such methods have shown a lack of accuracy in phenotype classification. Results: Here we propose a novel methodology called MITHrIL (Mirna enrIched paTHway Impact anaLysis) for the analysis of signaling pathways, which builds on the work of Tarca et al., 2009. MITHrIL extends pathways by adding missing regulatory elements, such as microRNAs, and their interactions with genes. The method takes as input the expression values of genes and/or microRNAs and returns a list of pathways sorted according to their degree of deregulation, together with the corresponding statistical significance (p-values). Our analysis shows that MITHrIL outperforms its competitors even in the worst case. In addition, our method is able to correctly classify sets of tumor samples drawn from TCGA. Availability: MITHrIL is freely available at the following URL: http://alpha.dmi.unict.it/mithril

    Mining genetic, transcriptomic, and imaging data in Parkinson’s disease

    Parkinson’s disease (PD) is a brain disorder that leads to shaking, stiffness, and difficulties with walking, balance, and coordination. Affected people may also have mental and behavioral changes, sleep problems, depression, memory difficulties, and fatigue. PD is an age-related disease, with an increased prevalence in subjects over the age of 60. About 5 to 10% of PD patients have an "early-onset" variant, which is often, but not always, inherited. PD is characterized by the loss of groups of neurons involved in the control of voluntary movements. Here we present a novel imaging-genetics workflow for Parkinson’s disease, aimed at discovering new potential candidate biomarkers for PD onset by integrating genotyping, transcriptomic, functional (Dopamine Transporter Scan) and morphological (Magnetic Resonance Imaging) imaging data. The proposed tutorial aims to encourage attendees to pursue biomedical research that takes advantage of the integration of heterogeneous data. In the last decade, the use of images together with genetic data has become widespread among bioinformatics researchers. This has made it possible to investigate specific diseases in detail and to better understand their origin and cause. While in recent years many imaging genetics analyses have been developed and successfully applied to characterize brain functioning and neurodegenerative diseases such as Alzheimer’s disease, to our knowledge no standard imaging genetics workflow has been proposed for PD. The novelty of our workflow can be summarized as follows:
    • We propose a domain-free and easy-to-use workflow that integrates heterogeneous data, such as genotyping, transcriptomic, and imaging data.
    • The workflow addresses the complexity of integrating real multi-source data when only a limited amount of data is available by proposing a three-step method: the first step integrates genotyping and imaging features, considering each feature individually; the second step summarizes the imaging features in a single measure; and the last step focuses on linking potential functional effects caused by the biomarkers found during the two previous steps.
    • We propose a validation of the method on genetic and imaging data related to PD, showing our new results.
    The data used for this tutorial were obtained from the Parkinson’s Progression Marker Initiative (PPMI) data portal. Currently, PPMI is the most complete and comprehensive collection of PD-related data. The dataset that will be used in the tutorial consists of a set of polymorphisms, more specifically insertions and deletions (indels) or Single Nucleotide Polymorphisms (SNPs), and transcriptomic data retrieved by RNA sequencing. In addition, DaTSCAN and MRI data are used, which have been shown to be effective in providing potential biomarkers for PD onset and progression. The attendees will gain experience in conducting a complete imaging-genetics workflow in a specific case study of Parkinsonian subjects. After the tutorial session, the attendees will be able to run an imaging-genetics pipeline themselves, which could also be applied to study other neurological diseases. The tutorial will introduce the participants to the biological background, especially the notions of DNA, RNA, Single Nucleotide Polymorphism (SNP) and Genome-Wide Association Study (GWAS). The participants will have the opportunity to become familiar with PLINK, a free, open-source whole genome association analysis toolset designed to perform a range of basic, large-scale analyses in a computationally efficient manner. It provides a large range of functionalities for genotyping data analysis, including data management, summary statistics, quality control, population stratification detection, and association analysis. The audience will also learn how to run code in the widely used R programming environment for statistical computing and graphics. They will also learn some notions of Python, especially how to deal efficiently with genotyping data using the Pandas library, which was designed for data manipulation and analysis. The tutorial code is wrapped in different Jupyter notebooks (formerly IPython Notebooks), a web-based and system-independent interactive computational environment for easy reproducibility of analyses

    GRAPES-DD: exploiting decision diagrams for index-driven search in biological graph databases

    BACKGROUND: Graphs are mathematical structures widely used for expressing relationships among elements when representing biomedical and biological information. On top of these representations, several analyses are performed. A common task is the search for a substructure within a single graph, called the target. The problem is referred to as one-to-one subgraph search, and it is known to be NP-complete. Heuristics and indexing techniques can be applied to facilitate the search. Indexing techniques are also exploited in the context of searching in a collection of target graphs, referred to as the one-to-many subgraph problem. Filter-and-verification methods that use indexing approaches provide fast pruning of target graphs, or parts of them, that do not contain the query. The expensive verification phase is then performed only on the subset of promising targets. Indexing strategies extract graph features at a granularity level sufficient for a powerful filtering step. Features are stored in data structures that allow efficient access. Index size, querying time and filtering power are key points for the development of efficient subgraph searching solutions. RESULTS: An existing approach, GRAPES, has been shown to have good performance in terms of speed-up for both the one-to-one and one-to-many cases. However, it suffers from the size of the built index. For this reason, we propose GRAPES-DD, a modified version of GRAPES in which the indexing structure has been replaced with a Decision Diagram. Decision Diagrams are a broad class of data structures widely used to encode and manipulate functions efficiently. Experiments on biomedical structures and synthetic graphs have confirmed our expectation, showing that GRAPES-DD substantially reduces memory utilization compared to GRAPES without worsening the searching time. CONCLUSION: The use of Decision Diagrams for searching in biochemical and biological graphs is completely new and potentially promising thanks to their ability to encode sets compactly by exploiting their structure and regularity, and to manipulate entire sets of elements at once instead of exploring each single element explicitly. Search strategies based on Decision Diagrams make indexing more affordable, for biochemical graphs and beyond, allowing us to potentially deal with huge and ever-growing collections of biochemical and biological structures
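
The filter-and-verification idea described in this abstract can be illustrated with a small Python sketch. It is not the GRAPES or GRAPES-DD implementation: the index here is a plain `Counter` of labeled-path features rather than a trie or a Decision Diagram, and the function names, the toy graphs, and the `max_len` parameter are all assumptions made for illustration. A target survives the filter only if it contains every labeled path of the query at least as many times as the query does, which is a necessary condition for containing the query as a subgraph.

```python
from collections import Counter

def path_features(adj, labels, max_len):
    """Count the label sequences of all simple paths with up to max_len nodes."""
    counts = Counter()

    def extend(path):
        # Record the label sequence of the current simple path.
        counts[tuple(labels[v] for v in path)] += 1
        if len(path) == max_len:
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:
                extend(path + [nxt])

    for start in adj:
        extend([start])
    return counts

def filter_candidates(index, query_features):
    """Keep only targets whose feature counts dominate those of the query."""
    return [
        name
        for name, feats in index.items()
        if all(feats[f] >= c for f, c in query_features.items())
    ]

# Hypothetical toy collection: a labeled triangle and a labeled edge.
targets = {
    "g1": ({0: [1, 2], 1: [0, 2], 2: [0, 1]}, {0: "A", 1: "B", 2: "C"}),
    "g2": ({0: [1], 1: [0]}, {0: "A", 1: "B"}),
}
index = {name: path_features(adj, lab, 3) for name, (adj, lab) in targets.items()}

# Query: a single edge between a "B" node and a "C" node.
query = path_features({0: [1], 1: [0]}, {0: "B", 1: "C"}, 3)
candidates = filter_candidates(index, query)
```

Only the graphs left in `candidates` would then be passed to the expensive verification step, which runs an exact subgraph search.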

    A SystemC Platform for Signal Transduction Modelling and Simulation in Systems Biology

    Signal transduction is a class of cellular biological processes that are commonly represented as highly concurrent reactive systems. In the Systems Biology community, modelling and simulation of signal transduction require overcoming issues such as discrete event-based execution of complex systems, description from building blocks through composition and encapsulation, description at different levels of granularity, and methods for abstraction and refinement. This paper presents a signal transduction modelling and simulation platform based on SystemC, and shows how the platform allows handling the system complexity by modelling it at different abstraction levels. The paper reports the results obtained by applying the platform to model the intracellular signalling network controlling integrin activation mediating leukocyte recruitment from the blood into the tissues. The dynamic simulation of the model has been conducted with the aim of exploring oscillating behaviors of such a biochemical circuit and, more generally, of helping to better understand the properties of the overall dynamics of leukocyte recruitment

    A subgraph isomorphism algorithm and its application to biochemical data

    Background: Graphs can represent biological networks at the molecular, protein, or species level. An important query is to find all matches of a pattern graph to a target graph. Accomplishing this is inherently difficult (NP-complete) and the efficiency of heuristic algorithms for the problem may depend upon the input graphs. The common aim of existing algorithms is to eliminate unsuccessful mappings as early and as inexpensively as possible. Results: We propose a new subgraph isomorphism algorithm which applies a search strategy to significantly reduce the search space without using any complex pruning rules or domain reduction procedures. We compare our method with the most recent and efficient subgraph isomorphism algorithms (VFlib, LAD, and our C++ implementation of FocusSearch, which was originally distributed in Modula2) on synthetic, molecular, and interaction network data. We show a significant reduction in the running time of our approach compared with these other excellent methods and show that our algorithm scales well as memory demands increase. Conclusions: Subgraph isomorphism algorithms are intensively used by biochemical tools. Our analysis gives a comprehensive comparison of different software approaches to subgraph isomorphism, highlighting their weaknesses and strengths. This will help researchers make a rational choice among methods depending on their application. We also distribute an open-source package including our system and our own C++ implementation of FocusSearch together with all the datasets used (http://ferrolab.dmi.unict.it/ri.html). In future work, our findings may be extended to approximate subgraph isomorphism algorithms
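
To make the problem statement concrete, the sketch below enumerates all matches of a pattern graph in a target graph with plain backtracking. It is a generic textbook-style illustration, not the algorithm proposed in the paper nor any of the compared tools. The function name, the degree-based ordering of pattern nodes, and the toy graphs are assumptions. It looks for non-induced matches: edges of the pattern must exist in the target, but extra target edges are allowed.

```python
def find_matches(pattern, target):
    """Return all injective node mappings from pattern to target that preserve edges.

    Both graphs are undirected and given as dicts: node -> set of neighbors.
    """
    # Simple heuristic (an assumption): match high-degree pattern nodes first.
    order = sorted(pattern, key=lambda v: -len(pattern[v]))
    matches = []

    def backtrack(mapping):
        if len(mapping) == len(order):
            matches.append(dict(mapping))
            return
        p = order[len(mapping)]
        used = set(mapping.values())
        for t in target:
            # Discard candidates that are already used or have too few neighbors.
            if t in used or len(target[t]) < len(pattern[p]):
                continue
            # Every already-mapped neighbor of p must map to a neighbor of t.
            if all(mapping[q] in target[t] for q in pattern[p] if q in mapping):
                mapping[p] = t
                backtrack(mapping)
                del mapping[p]

    backtrack({})
    return matches

# Hypothetical toy input: a triangle pattern and a triangle with a pendant node.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
target = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
triangle_matches = find_matches(triangle, target)
```

Each unsuccessful partial mapping is abandoned as soon as an edge constraint fails, which is the "eliminate unsuccessful mappings early" aim the abstract mentions. Practical algorithms differ mainly in how they order nodes and prune candidates.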

    Comprehensive reconstruction and visualization of non-coding regulatory networks in human

    Considerable research attention has been devoted to understanding the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them great potential as non-invasive biomarkers. However, non-coding RNAs were discovered relatively recently and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNA interactions are important steps towards understanding their regulatory mechanisms in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNA interaction data from a large number of well-established online repositories. The interactions involve RNA, DNA, proteins and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web-based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all the visual and mining features available in Cytoscape

    Hypoxia and extracellular vesicles: A review on methods, vesicular cargo and functions

    Hypoxia is an essential hallmark of several serious diseases such as cardiovascular and metabolic disorders and cancer. A decline in the tissue oxygen level induces hypoxic responses in cells which strive to adapt to the changed conditions. A failure to adapt to prolonged or severe hypoxia can trigger cell death. While some cell types, such as neurons, are highly vulnerable to hypoxia, cancer cells take advantage of a hypoxic environment to undergo tumour growth, angiogenesis and metastasis. Hypoxia-induced processes trigger complex intercellular communication and there are now indications that extracellular vesicles (EVs) play a fundamental role in these processes. Recent developments in EV isolation and characterization methodology have increased the awareness of the importance of EV purity in functional and cargo studies. Cell death, a hallmark of severe hypoxia, is a known source of intracellular contaminants in isolated EVs. In this review, methodological aspects of studies investigating hypoxia-induced EVs are critically evaluated. Key concerns and gaps in the current knowledge are highlighted and future directions for studies are set. To accelerate and advance research, an in-depth analysis of the functions and cargo of hypoxic EVs, compared to normoxic EVs, is provided with the focus on the altered microRNA contents of the EVs

    APPAGATO: an APproximate PArallel and stochastic GrAph querying TOol for biological networks

    Motivation: Biological network querying is a problem that requires considerable computational effort to solve. Given a target and a query network, it aims to find occurrences of the query in the target by considering topological and node similarities (i.e. mismatches between nodes, edges, or node labels). Querying tools that deal with similarities are crucial in biological network analysis since they provide meaningful results even in the case of noisy data. In addition, since the size of available networks increases steadily, existing algorithms and tools are becoming unsuitable. This raises new challenges for the design of more efficient and accurate solutions. Results: This paper presents APPAGATO, a stochastic and parallel algorithm to find approximate occurrences of a query network in biological networks. APPAGATO handles node, edge, and node label mismatches. Thanks to its randomized and parallel nature, it applies to large networks and, compared to existing tools, it provides higher performance as well as statistically significantly more accurate results. Tests have been performed on protein-protein interaction networks annotated with synthetic and real gene ontology terms. Case studies have been done by querying protein complexes among different species and tissue